
    Intermediate Representations for Controllers in Chip Generators

    Creating parameterized “chip generators” has been proposed as one way to decrease chip NRE costs. While many approaches are available for creating or generating flexible data path elements, the design of flexible controllers is more problematic. The most common approach is to build the controller as a microcoded engine, which offers flexibility through programmable table-based lookup functions. This paper shows that after “programming” the hardware for the desired application or applications, these flexible controller designs can be easily converted to efficient fixed (or less programmable) solutions using partial-evaluation capabilities that are already present in most synthesis tools.
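The conversion the abstract describes can be illustrated with a toy sketch. Everything below is hypothetical (the two-state handshake table, the function names), and Python's `exec` stands in for the constant propagation a synthesis tool would perform on a frozen microcode table:

```python
def microcoded_step(microcode, state, inputs):
    """Flexible controller: all behavior lives in the programmable table."""
    return microcode[(state, inputs)]

def specialize(microcode):
    """Fold a now-constant table into straight-line code, eliminating the
    lookup -- a software stand-in for partial evaluation during synthesis."""
    lines = ["def step(state, inputs):"]
    for (s, i), nxt in sorted(microcode.items()):
        lines.append(f"    if (state, inputs) == ({s}, {i}): return {nxt!r}")
    lines.append("    raise ValueError('undefined transition')")
    ns = {}
    exec("\n".join(lines), ns)
    return ns["step"]

# A two-state handshake controller, "programmed" via its table:
table = {(0, 0): (0, "idle"), (0, 1): (1, "start"),
         (1, 0): (1, "busy"), (1, 1): (0, "done")}
fixed_step = specialize(table)
assert fixed_step(0, 1) == microcoded_step(table, 0, 1) == (1, "start")
```

The specialized `step` no longer consults a table at all; in hardware terms, the programmable lookup memory has become fixed combinational logic.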

    Long-term modification of cortical synapses improves sensory perception

    Synapses and receptive fields of the cerebral cortex are plastic. However, changes to specific inputs must be coordinated within neural networks to ensure that excitability and feature selectivity are appropriately configured for perception of the sensory environment. Long-lasting enhancements and decrements to rat primary auditory cortical excitatory synaptic strength were induced by pairing acoustic stimuli with activation of the nucleus basalis neuromodulatory system. Here we report that these synaptic modifications were approximately balanced across individual receptive fields, conserving mean excitation while reducing overall response variability. Decreased response variability should increase detection and recognition of near-threshold or previously imperceptible stimuli, as we found in behaving animals. Thus, modification of cortical inputs leads to wide-scale synaptic changes, which are related to improved sensory perception and enhanced behavioral performance.

    Reconstructing a 3D line from a single catadioptric image

    This paper demonstrates that, for axial non-central optical systems, the equation of a 3D line can be estimated using only four points extracted from a single image of the line. This result, which is a direct consequence of the lack of vantage point, follows from a classic result in enumerative geometry: there are exactly two lines in 3-space which intersect four given lines in general position. We present a simple algorithm to reconstruct the equation of a 3D line from four image points. This algorithm is based on computing the Singular Value Decomposition (SVD) of the matrix of Plücker coordinates of the four corresponding rays. We evaluate the conditions for which the reconstruction fails, such as when the four rays are nearly coplanar. Preliminary experimental results using a spherical catadioptric camera are presented. We conclude by discussing the limitations imposed by poor calibration and numerical errors on the proposed reconstruction algorithm.
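The core computation -- four rays, an SVD null space, and the Plücker quadric -- can be sketched in a few lines. This is a minimal illustration of the enumerative-geometry fact, not the paper's implementation; the helper names and test geometry are mine:

```python
import numpy as np

def plucker(p, q):
    """Plücker coordinates (d, m) of the 3D line through points p and q."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    d = q - p
    return np.concatenate([d, np.cross(p, d)])

def meets(x, y):
    """Bilinear Plücker form: zero iff lines x and y intersect (or are parallel)."""
    return x[:3] @ y[3:] + x[3:] @ y[:3]

def lines_meeting_four(rays):
    """Return the (generically two) lines meeting four given lines.

    Stack the rays with direction/moment halves swapped so every candidate
    line lies in the matrix's null space; recover that 2D null space by SVD,
    then impose the Plücker quadric d.m = 0, a homogeneous quadratic in the
    null-space mixing coefficients (a : b)."""
    A = np.array([np.concatenate([r[3:], r[:3]]) for r in rays])
    _, _, Vt = np.linalg.svd(A)
    L1, L2 = Vt[-2], Vt[-1]                      # null-space basis
    c2, c1, c0 = meets(L1, L1), meets(L1, L2), meets(L2, L2)
    disc = c1 * c1 - c2 * c0                     # c2*a^2 + 2*c1*a*b + c0*b^2 = 0
    if disc < 0:
        return []                                # the two solutions are complex
    r = np.sqrt(disc)
    sols = []
    for s in (r, -r):
        a, b = -c1 + s, c2
        if abs(a) < 1e-12 and abs(b) < 1e-12:
            a, b = c0, -c1 - s                   # same root, other parameterization
        X = a * L1 + b * L2
        sols.append(X / np.linalg.norm(X))
    return sols

# Two known skew lines and four transversals that meet both:
L = plucker([0, 0, 0], [1, 0, 0])
M = plucker([0, 1, 0], [0, 1, 1])
rays = [plucker([t, 0, 0], [0, 1, s])
        for t, s in [(0.3, 1.7), (0.7, -0.4), (1.3, 0.9), (2.1, 2.5)]]
recovered = lines_meeting_four(rays)   # two lines, which are ±L and ±M up to scale
```

As the abstract notes, the construction degrades when the four rays approach a degenerate configuration (e.g. nearly coplanar), which shows up here as a nearly rank-deficient `A` or a discriminant near zero.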

    Opening the Sensornet Black Box

    We argue that the principal cause of sensornet deployment and development difficulty is an inability to observe a network’s internal operation. We further argue that this lack of visibility is due to the activity and resource constraints enforced by limited energy. We present the Mote Network (MNet) architecture, which elevates visibility to be its dominant design principle. We propose a quantitative metric for network visibility and explain why network isolation and fairness are critical concerns. We describe the Fair Waiting Protocol (FWP), MNet’s single-hop protocol, and show how its fairness and isolation can improve throughput and efficiency. We present the Pull Collection Protocol as a case study in designing multi-hop protocols in the architecture.

    Verification of Chip Multiprocessor Memory Systems Using A Relaxed Scoreboard

    Verification of chip multiprocessor memory systems remains challenging. While formal methods have been used to validate protocols, simulation is still the dominant method used to validate memory system implementations. A memory scoreboard, a high-level model of the memory, greatly aids simulation-based validation, but accurate scoreboards are complex to create, since they often depend not only on the memory and consistency model but also on its specific implementation. This paper describes a methodology for using a relaxed scoreboard, which greatly reduces the complexity of creating these memory models. The relaxed scoreboard tracks the operations of the system to maintain a set of values that could possibly be valid for each memory location. By allowing multiple possible values, the model used in the scoreboard is only loosely coupled with the specific design. This decouples the construction of the checker from the implementation, allowing the checker to be used early in the design and to be built up incrementally, and it greatly reduces the scoreboard design effort. We demonstrate the use of the relaxed scoreboard in verifying RTL implementations of two different memory models, Transactional Coherence and Consistency (TCC) and Relaxed Consistency, for up to 32 processors. The resulting checker has a performance slowdown of 19% for checking Relaxed Consistency and less than 30% for TCC, allowing it to be used in all simulation runs.
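The key idea -- track a set of possibly valid values per location rather than predicting the single legal one -- can be sketched as a toy model. The class, method names, and pruning policy below are illustrative assumptions, not the paper's design:

```python
class RelaxedScoreboard:
    """Toy relaxed scoreboard: each address maps to the SET of values a
    load could legally observe, so a check fails only when an observed
    value is outside that set."""

    def __init__(self, initial=0):
        self.possible = {}          # address -> set of possibly-visible values
        self.initial = initial

    def write(self, addr, value):
        # A new in-flight store becomes one more legal outcome for later loads.
        self.possible.setdefault(addr, {self.initial}).add(value)

    def commit(self, addr, value):
        # An ordering point (e.g. a fence or a transaction commit) makes one
        # value globally visible and retires the stale alternatives.
        self.possible[addr] = {value}

    def check_read(self, addr, observed):
        return observed in self.possible.get(addr, {self.initial})

sb = RelaxedScoreboard()
sb.write(0x40, 1)                   # two in-flight stores to the same address
sb.write(0x40, 2)
assert sb.check_read(0x40, 2)       # either store is a legal observation...
assert sb.check_read(0x40, 0)       # ...as is the old value, pre-ordering
assert not sb.check_read(0x40, 7)   # a value no store produced: a violation
sb.commit(0x40, 2)
assert not sb.check_read(0x40, 1)   # after commit only one value remains legal
```

Because the checker only shrinks the allowed set at ordering points it actually observes, it stays valid across implementations that reorder stores differently, which is what lets it be built before the design is finalized.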

    B.7.2 [Hardware]: Integrated Circuits – Design Aids

    The drive for low-power, high-performance computation, coupled with the extremely high design costs of ASICs, has driven a number of designers to try to create a flexible, universal computing platform that will supersede the microprocessor. We argue that these flexible, general computing chips are trying to accomplish more than is commercially needed. Since design NRE costs are an order of magnitude larger than fabrication NRE costs, a two-step design system seems attractive. First, the users configure/program a flexible computing framework to run their application with the desired performance. Then, the system “compiles” the program and configuration, tailoring the original framework to create a chip that is optimized toward the desired set of applications. Thus the user gets the reduced development costs of a flexible solution with the efficiency of a custom chip.

    Understanding sources of inefficiency in general-purpose chips

    Due to their high volume, general-purpose processors, and now chip multiprocessors (CMPs), are much more cost-effective than ASICs, but lag significantly in terms of performance and energy efficiency. This paper explores the sources of these performance and energy overheads in general-purpose processing systems by quantifying the overheads of a 720p HD H.264 encoder running on a general-purpose CMP system. It then explores methods to eliminate these overheads by transforming the CPU into a specialized system for H.264 encoding. We evaluate the gains from customizations useful to broad classes of algorithms, such as SIMD units, as well as those specific to a particular computation, such as customized storage and functional units. The ASIC is 500x more energy efficient than our original four-processor CMP. Broadly applicable optimizations improve performance by 10x and energy by 7x. However, the very low energy cost of actual core ops (100s of fJ in 90nm) means that over 90% of the energy used in these solutions is still “overhead”. Achieving ASIC-like performance and efficiency requires algorithm-specific optimizations. For each sub-algorithm of H.264, we create a large, specialized functional unit capable of executing 100s of operations per instruction. This improves performance and energy by an additional 25x, and the final customized CMP matches an ASIC solution’s performance, coming within 3x of its energy and within comparable area.
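The abstract's overhead accounting can be made concrete with toy numbers. Only the quoted figures (a roughly 500x ASIC energy advantage, 7x from broadly applicable optimizations, over 90% residual overhead) come from the text; the absolute per-op energy below is an illustrative assumption:

```python
# Toy energy accounting in the spirit of the abstract's overhead analysis.
core_op_energy = 300e-15                          # "100s of fJ" per core op (assumed value)
baseline_per_op = core_op_energy * 500            # CMP ~500x less efficient than the ASIC
optimized_per_op = baseline_per_op / 7            # after broadly applicable optimizations (7x)

# Fraction of the optimized design's energy that is NOT useful computation:
overhead_fraction = 1 - core_op_energy / optimized_per_op
assert overhead_fraction > 0.90                   # matches the ">90% overhead" claim
```

The arithmetic makes the abstract's point visible: even after a 7x energy improvement, useful arithmetic accounts for only a few percent of total energy, so closing the remaining gap requires the algorithm-specific functional units the paper goes on to build.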